Scalable Sequential Spectral Clustering
Authors
Abstract
Over the past decades, spectral clustering (SC) has become one of the most effective clustering approaches. Although widely used, SC has one significant drawback: its high computational cost. Many efforts have been devoted to accelerating SC algorithms, and promising results have been achieved. However, most existing algorithms assume that the data can be stored in memory. When the data do not fit in memory, these algorithms suffer severe performance degradation. To overcome this issue, we propose a novel sequential SC algorithm for large-scale clustering with limited computational resources, e.g., memory. We begin by investigating an effective way of approximating the graph affinity matrix by leveraging a bipartite graph. We then choose a graph construction and optimization strategy that avoids random access to the data. These efforts lead to an efficient SC algorithm whose memory usage is independent of the number of input data points. Extensive experiments on large datasets demonstrate that the proposed sequential SC algorithm is up to a thousand times faster than state-of-the-art methods.
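The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of a bipartite-graph (anchor-based) affinity approximation processed in sequential chunks. It is not the authors' method. The function names `anchor_affinities` and `spectral_embedding`, the parameters `s` and `sigma`, the Gaussian kernel, and the assumption that anchor points are already given are all hypothetical choices made for illustration. With Z the row-normalized point-to-anchor affinity matrix and Λ the diagonal matrix of anchor degrees, the full affinity is approximated as W ≈ Z Λ⁻¹ Zᵀ, and its leading eigenvectors are recovered from the small m × m matrix Λ^(-1/2) Zᵀ Z Λ^(-1/2), which can be accumulated one chunk at a time.

```python
import numpy as np

def anchor_affinities(chunk, anchors, s=3, sigma=1.0):
    """Row-normalized Gaussian affinities from each point to its s nearest anchors."""
    d2 = ((chunk[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)  # (b, m)
    far = np.argsort(d2, axis=1)[:, s:]          # all but the s nearest anchors
    # Shifting by the row minimum avoids underflow; it cancels after normalization.
    z = np.exp(-(d2 - d2.min(axis=1, keepdims=True)) / (2.0 * sigma ** 2))
    np.put_along_axis(z, far, 0.0, axis=1)       # keep only the s nearest anchors
    return z / z.sum(axis=1, keepdims=True)

def spectral_embedding(chunks, anchors, k, s=3, sigma=1.0):
    """Yield k-dimensional spectral embeddings chunk by chunk.

    `chunks` is a callable returning a fresh iterator over (b, d) arrays,
    so the data are read sequentially twice and never held in memory at once.
    """
    m = anchors.shape[0]
    gram = np.zeros((m, m))   # accumulates Z^T Z
    deg = np.zeros(m)         # accumulates anchor degrees Z^T 1
    for chunk in chunks():    # pass 1: small m x m statistics only
        z = anchor_affinities(chunk, anchors, s, sigma)
        gram += z.T @ z
        deg += z.sum(axis=0)
    scale = 1.0 / np.sqrt(np.maximum(deg, 1e-12))        # diagonal of Lambda^(-1/2)
    r = gram * scale[:, None] * scale[None, :]
    evals, evecs = np.linalg.eigh(r)                     # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:k]
    sing = np.sqrt(np.maximum(evals[top], 1e-12))        # singular values of Z Lambda^(-1/2)
    proj = scale[:, None] * evecs[:, top] / sing         # (m, k) projection matrix
    for chunk in chunks():    # pass 2: project each chunk independently
        yield anchor_affinities(chunk, anchors, s, sigma) @ proj
```

In this sketch the working memory is governed by the chunk size and the number of anchors m, not by the total number of points. A streaming clustering step (for example, mini-batch k-means) would still be needed on the yielded embeddings to obtain cluster labels.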
Similar resources
Fast Spectral Clustering of Data Using Sequential Matrix Compression
Spectral clustering has attracted much research interest in recent years since it can yield impressively good clustering results. Traditional spectral clustering algorithms first solve an eigenvalue decomposition problem to get the low-dimensional embedding of the data points, and then apply some heuristic methods such as k-means to get the desired clusters. However, eigenvalue decomposition is...
Scalable Spectral Clustering with Weighted PageRank
In this paper, we propose an accelerated spectral clustering method, using a landmark selection strategy. According to the weighted PageRank algorithm, the most important nodes of the data affinity graph are selected as landmarks. The selected landmarks are provided to a landmark spectral clustering technique to achieve scalable and accurate clustering. In our experiments with two benchmark fac...
Landmark selection for spectral clustering based on Weighted PageRank
Spectral clustering methods have various real-world applications, such as face recognition, community detection, and protein sequence clustering. Although spectral clustering methods can detect arbitrarily shaped clusters and thus achieve high clustering accuracy, their heavy computational cost limits their scalability. In this paper, we propose an accelerated spectral clustering method based on...
Towards Scalable Spectral Clustering via Spectrum-Preserving Sparsification
The eigendecomposition of nearest-neighbor (NN) graph Laplacian matrices is the main computational bottleneck in spectral clustering. In this work, we introduce a highly scalable, spectrum-preserving graph sparsification algorithm that makes it possible to build ultra-sparse NN (u-NN) graphs with guaranteed preservation of the original graph spectra, such as the first few eigenvectors of the original gr...
A Novel Clustering Algorithm Based on Bayesian Sequential Partition and Its Application in Image Segmentation
In this work, we propose a novel clustering algorithm based on Bayesian Sequential Partition (BSP) and the spectral clustering algorithm. Since BSP is capable of providing much more accurate density estimates when the sample space is of moderate to high dimension, the proposed clustering algorithm is believed to be superior to available ones when dealing with high-dimensional problems. To d...